Parallel Variable-Length Encoding on GPGPUs
Author
Abstract
Variable-Length Encoding (VLE) reduces input data size by replacing fixed-length data words with shorter codewords. Because VLE is one of the main building blocks in multimedia compression systems, an efficient implementation is essential. The massively parallel architecture of modern general-purpose graphics processing units (GPGPUs) has been used successfully to accelerate inherently parallel compression blocks, such as image transforms and motion estimation. VLE, on the other hand, is an inherently serial process, because each codeword writes a variable number of bits to the compressed data stream. The introduction of atomic operations on the latest GPGPUs enables many threads to write to output memory locations in parallel. We present a novel data-parallel algorithm for variable-length encoding using atomic operations, which achieves speedups of up to 35-50x on a CUDA-enabled GPGPU.
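The idea of many threads writing variable-length codewords into a shared output can be illustrated with a small sequential simulation. The sketch below is not the paper's implementation. It assumes that each codeword's start bit comes from an exclusive prefix sum over the codeword lengths (the abstract does not say how offsets are obtained), that codewords are at most 32 bits long, and that output words are 32 bits wide. The function names, the codebook layout, and the example codebook are hypothetical. Each loop iteration stands in for one GPU thread, and each `|=` stands in for an atomic OR (for example CUDA's `atomicOr`), which is what would make concurrent writes to the same word safe.

```python
from itertools import accumulate

WORD_BITS = 32  # assumed width of one output word


def parallel_vle_encode(symbols, codebook):
    """Simulate data-parallel VLE.

    codebook maps symbol -> (codeword_value, bit_length), bit_length <= 32.
    Returns (list of 32-bit output words, total number of valid bits).
    """
    # Step 1 (one thread per symbol): look up the codeword and its length.
    codes = [codebook[s] for s in symbols]
    lengths = [length for _, length in codes]

    # Step 2: exclusive prefix sum of lengths gives each codeword's start bit.
    offsets = [0] + list(accumulate(lengths))
    total_bits = offsets[-1]
    out = [0] * ((total_bits + WORD_BITS - 1) // WORD_BITS)

    # Step 3 (one thread per symbol): OR the codeword into the output.
    # On a GPU each "|=" would be an atomic OR, since neighbouring
    # threads may touch the same output word.
    for (value, length), start in zip(codes, offsets):
        if length == 0:
            continue
        word, bit = divmod(start, WORD_BITS)
        room = WORD_BITS - bit  # free bits left in this word
        if length <= room:
            out[word] |= value << (room - length)
        else:
            spill = length - room  # bits that overflow into the next word
            out[word] |= value >> spill
            out[word + 1] |= (value & ((1 << spill) - 1)) << (WORD_BITS - spill)
    return out, total_bits


def to_bitstring(words, total_bits):
    """Render the valid part of the output as a string of '0'/'1'."""
    return "".join(f"{w:032b}" for w in words)[:total_bits]


# Hypothetical prefix-free codebook, for illustration only.
EXAMPLE_CODEBOOK = {"a": (0b0, 1), "b": (0b10, 2), "c": (0b110, 3), "d": (0b111, 3)}
```

Calling `parallel_vle_encode("abacad", EXAMPLE_CODEBOOK)` packs six codewords into a single output word. Because every codeword only sets bits inside its own precomputed range, the order in which the simulated threads run does not affect the result, which is why an atomic OR is sufficient in this sketch.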
Similar resources
Using Arithmetic Coding for Reduction of Resulting Simulation Data Size on Massively Parallel GPGPUs
The popularity of parallel platforms such as general-purpose graphics processing units (GPGPUs) for large-scale simulations is rapidly increasing; however, the I/O bandwidth and storage capacity of these massively parallel cards remain the major bottlenecks. We propose a novel approach for post-processing of simulation data directly on GPGPUs by efficient data size reduction immediately after ...
Low-Power Scientific Computing
Introduction: Scientists and mathematicians are increasingly realizing the computational benefits of using modern, multi-core architectures. In response to this, manufacturers of traditional desktop graphics-processing units (GPUs) have evolved their architectures to create desktop and server GPGPUs (General Purpose Graphics Processing Units). These GPGPUs are quickly becoming the platform of c...
A parallel CAVLC design for 4096×2160p encoder
This paper presents a high-performance VLSI design of Context-Based Adaptive Variable-Length Coding (CAVLC) for a 4096x2160p@60fps H.264/AVC encoder. A parallel architecture is proposed to make the scan and encode stages work simultaneously. Four coefficients are scanned in parallel, and four Levels and Run_before are coded in parallel. From experimental results, only 120 cycles at most are needed...
Efficient Probabilistic Model Checking on General Purpose Graphics Processors
We present algorithms for parallel probabilistic model checking on general-purpose graphics processing units (GPGPUs). For this purpose we exploit the fact that some of the basic algorithms for probabilistic model checking rely on matrix-vector multiplication. Since linear algebraic operations of this kind are implemented very efficiently on GPGPUs, the new parallel algorithms can achieve consid...
Parallel visual data restoration on multi-GPGPUs using stencil-reduce pattern
In this paper, a highly effective parallel filter for visual data restoration is presented. The filter is designed following a skeletal approach, using a newly proposed stencil-reduce, and has been implemented by way of the FastFlow parallel programming library. As a result of its high-level design, it is possible to run the filter seamlessly on a multicore machine, on multi-GPGPUs, or on both....